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Abstract 

How organisms adapt to different climate habitats is a key question in evolu- 
tionary ecology and biological conservation. Species distributions are often 
determined by climate suitability. Consequently, the anthropogenic impact on 
earth's climate is of key concern to conservation efforts because of our relatively 
poor understanding of the ability of populations to track and evolve to climate 
change. Here, we investigate the ability of Arabidopsis thaliana to occupy cli- 
mate space by quantifying the extent to which different climate regimes are 
accessible to different A. thaliana genotypes using publicly available data from a 
large-scale genotyping project and from a worldwide climate database. The 
genetic distance calculated from 149 single-nucleotide polymorphisms (SNPs) 
among 60 lineages of A. thaliana was compared to the corresponding climate 
distance among collection localities calculated from nine different climatic fac- 
tors. A. thaliana was found to be highly labile when adapting to novel climate 
space, suggesting that populations may experience few constraints when adapt- 
ing to changing climates. Our results also provide evidence of a parallel or con- 
vergent evolution on the molecular level supporting recent generalizations 
regarding the genetics of adaptation. 



doi: 10.1002/ece3.622 

Introduction 

Climate is one of the most important factors determining 
the distribution of plants (Walther 2003) and therefore 
adaptation to climate should be a major selective force. 
Furthermore, the ability to adapt to climate heterogeneity 
can facilitate or constrain the dispersal of organisms, 
affecting species range (Angert et al. 2011), and climate 
adaptation may even play an important role in speciation 
(Keller and Seehausen 2012). Although historically local 
climates have been known to fluctuate across space and 
time at an ecological scale, human impacts are accelerat- 
ing climate change and this has already affected the sur- 
vival and distribution of some organisms (Parmesan and 
Yohe 2003; Parmesan 2006). The effects of climate change 
are expected to increase in the future (Hancock et al. 
2011). Thus, the ability to adapt to different climate 
regimes will likely be an important factor in the persis- 
tence of populations and species. This is especially true of 
plants, which are sessile and less able to disperse to more 
favorable climates as climate change occurs. 



Of particular interest is how labile populations are with 
respect to climate adaptation. That is, how easily are they 
able to expand their range into novel climate space, and 
how readily are they able to respond to climate shifts in 
their own range? The ability to predict the evolutionary 
dynamics that will result from widespread climate change 
will inform both conservation efforts and basic evolution- 
ary theory (Bradshaw and Holzapfel 2001; Olsen et al. 
2004; Teplitsky et al. 2008; Kearney et al. 2009; Hoffman 
and Sgro 2011; Hansen et al. 2012). 

Studies of the effect of climate on species ranges have a 
long history in plant ecology and evolution (see e.g., Darwin 
1859, ch. 11). Furthermore, there is extensive evidence for 
ecotypic variation within species that contributes to climate 
adaptation (e.g., Clausen 1926; Clausen et al. 1947; Lowry 
and Willis 2010). Although plasticity does play a role (Nico- 
tra et al. 2010), the overall picture is that there is a signifi- 
cant genetic contribution to climate adaptation. 

Large, publicly available data sets provide a wealth of 
information for genetic studies. Climate data are also 
widely available. Given that climate is a significant selec- 
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tive pressure, when populations have resided in a locality 
for a considerable time (number of generations), it is rea- 
sonable to assume that they have adapted to the local 
conditions. Therefore, combining such large-scale data 
sets allows researchers to estimate adaptation to climate 
on a greater scale than would be possible using experi- 
mental methods (Banta et al. 2012). 

The mouse-eared cress Arabidopsis thaliana is an ideal 
candidate for such a study. A. thaliana exhibits an annual 
life-history strategy with a cosmopolitan distribution across 
a wide range of habitat types. As a model organism for 
genetic studies, A. thaliana strains from many different cli- 
mate regimes have been extensively geno typed (Shindo et al. 
2007). Climate is known to be an important feature affecting 
fitness of A. thaliana (Wilczek et al. 2009; Fournier-Level 
et al. 2011). Climate regimes have been experimentally 
shown to predict performance under common garden con- 
ditions (Hoffmann et al. 2005; Rutter and Fenster 2007). 
Finally, climate has been shown to be an important factor 
limiting the distribution of A. thaliana (Hoffmann 2002). 
Although only a few loci contributing to climate adaptation 
have been well studied, the emerging picture is that climate 
adaptation in A. thaliana is affected by a vast network of 
genes affecting traits such as tolerance to temperature (Wes- 
terman 1971) and drought (McKay et al. 2003). Loci related 
to climate adaptation have been found to be widespread 
throughout the genome by a genome scan (Hancock et al. 
2011) and a recent study found a correlation between cli- 
mate and particular nonsynonymous substitutions at the 
genomic level (Lasky et al. 2012). Despite this, few studies 
have empirically examined adaptation to climate in natural 
A. thaliana populations due in large part to the difficulty in 
conducting field studies across a large sample of populations 
(but see Agren and Schemske 2012). When environmental 
factors can be correlated with fitness, relying on publicly 
available environmental and genetic data allows for more 
comprehensive studies. 

Here, we quantify whether the genotype of an ecotype 
is a useful predictor of the climate habitat it occupies. On 
the basis of earlier studies (Wilczek et al. 2009; Lasky 
et al. 2012), we expect the relationship between genetic 
distance and climate distance to be positive. However, 
how strong shared evolutionary lineage determines the 
ability to invade climate space is key to our understand- 
ing the lability of populations to adapt to climate. If there 
is a weak relationship between genetic relatedness and 
occupied climate space then it would suggest that there 
are multiple ways that a lineage can adapt to a particular 
climate regime, indicating high lability in the ability of 
this organism to adapt to climate. This question is highly 
relevant given the current state of drastic anthropogenic 
climate change. If the relationship between climate space 
and genotype space is limited, then it bodes ill for organ- 
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isms like plants that may be restricted in their ability to 
escape unsuitable habitat. 

Methods 

We used a large genetic data set from A. thaliana and a 
worldwide climate database to examine the relationship 
between genetic relatedness and occupied climate space. 
To compile data on a substantial number of ecotypes and 
to generate a genetic distance matrix, we took advantage 
of publically available data from a large-scale genotyping 
study (Borevitz lab: http://www.naturalvariation.org/ 
hapmap). The A. thaliana accessions that we used were 
taken from 853 lines characterized at 149 SNPs. It was 
important to have evidence that the accessions had expe- 
rienced the local climate for long enough to adapt to 
their collection climate locality. Thus, we attempted to 
only use accessions that were collected from less anthro- 
pogenically disturbed habitats (i.e., not roadsides) typical 
of A. thaliana's natural habitat where they were more 
likely to have a relatively long history, and consequently 
enough time to adapt to local climatic conditions. Such 
habitats include steep rocky slopes, open areas near forest 
(but not in understory), and open habitats with sandy or 
limited soil. We included these habitats as well as habitats 
that reflect some human disturbance including fallow 
fields, rocky walls, cemeteries with sandy soil, and etc. In 
response to reviewers' suggestions we added a further 67 
accessions that from the habitat descriptions appeared to 
be from more anthropogenically disturbed sites including 
roadsides, tourist parks, fields under active cultivation, 
railway ballasts, and etc. (Appendix B). However, the vast 
majority of lines had no habitat data and were omitted 
immediately. Of the rest, several were collected outside of 
the native range of A. thaliana (e.g., in North America) 
and several more were not genotyped at the majority of 
the SNPs. In addition, we excluded such habitats as 
"Botanic Garden" or any university-associated sites, as 
those plants may represent escaped accessions adapted to 
other localities. This resulted in a data set consisting of 
60 accessions (Appendix A) that derived from what we 
consider the least anthropogenically disturbed habitats 
and 67 more accessions from sites that might reflect 
higher anthropogenic disturbance (Appendix B). 

To generate a climate distance matrix among the 60 
and 67 accessions, we compiled climate data for each 
locality from a database consisting of nine different cli- 
mate factors recorded every 10 degree minutes worldwide. 
The closest recorded point to the collection site of each 
A. thaliana line was used for this study. In some cases this 
created overlap in the site data for certain accessions. 
While this data set does not capture what may be impor- 
tant microclimate variation, it was the most precise data 
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available to us. Given that previous studies have demon- 
strated a genetic contribution to climate adaptation, we 
believe that this will provide a conservative measure of 
the lability of genetic adaptation to climate, as genotypes 
from populations in the same climate regime would be 
expected to increase the correlation between genotype 
and habitat climate. Data for eight of the climate factors 
were collected monthly. These were precipitation (pre), 
number of wet days (wet), mean temperature (tmp), 
mean diurnal temperature range (dtr), relative humidity 
(reh), sunshine (sunp), ground frost (frs), and 10-m wind 
speed (wnd). The ninth was elevation and consisted of a 
single measure for each location. We included all available 
climate factors to avoid any a priori assumptions about 
which factors were most important. We ran additional 
analyses on a selected subset of the data (mean tempera- 
ture from November to June and precipitation from June 
to August). These factors were selected using Banta et al. 
(2012) as a guide. This reduced the partial Mantel corre- 
lation slightly and to a nonsignificant degree. We there- 
fore included all climate factors in the final analysis. The 
data are from New et al. (2002) and can be downloaded 
from Climate Research Unit website (http://www.cru.uea. 
ac.uk/cru/data/tmc.htm) . 

To compare climate distance to genetic distance we calcu- 
lated distance matrices for both genotype (SNPs) and climate. 
The genetic distance matrix was calculated using DNADIST 
from the PHYLIP package (Felsenstefn 1989) using the F84 
substitution model. The climate distance matrix was calcu- 
lated using PROC DISTANCE METHOD=DGOWERS in 
SAS 9.1.3 (SAS Institute 2004). This is Gower's environmen- 
tal distance metric (Cower 1971). 

To compare the genetic distance matrix to the climate 
distance matrix using tree-based methods, we estimated 
neighbor-joining trees for each distance matrix using the 
program NEIGHBOR in the PHYLIP package (Felsenstein 
1989). We then calculated the Robinson-Foulds tree dis- 
tance metric using TREEDIST from the PHYLIP package 
(Felsenstein 1989). This metric measures the dissimilarity 
among the overall topology of two or more unrooted 
trees (Robinson and Foulds 1981). Smaller numbers indi- 
cate higher similarity among topologies. The scale of the 
metric ranges from 0 (total concordance) to 2n — 6, 
where n is the number of terminal nodes. In the situation 
where we used the 60 accessions, then the maximum 
Robinson-Foulds index would be 2 (60) — 6 = 114. We 
then used the program topd/fMtS (Puigbo et al. 2007) to 
calculate a Robinson-Foulds metric among a set of ran- 
domized trees. This number is expected to reflect low 
concordance due to random topologies. 

We expected that geographic distance could inflate the 
relationship between genetic and climate distance because 
closely related genotypes are expected to share geographic 



locales and hence similar climates (Beck et al. 2008). Thus, 
to remove the confounding influence of geographic proxim- 
ity, we calculated an additional matrix of geographic great 
circle distance in R (R Development Core Team 2011). We 
used the VEGAN (Oksanen et al. 2011) package in R to cal- 
culate the partial Mantel correlation (Mantel 1967) between 
the genetic distance matrix and the climate distance matrix 
controlling for the geographic distance matrix for both the 
60 least disturbed and 67 moderately disturbed accessions 
separately and together. Mantel and partial Mantel tests are 
commonly used in ecology to study the relationship between 
ecological factors and genetic distance (Smouse et al. 1986). 
VEGAN calculates three correlation measures: Pearson's 
product moment correlation, Spearman's rank correlation, 
and Kendall's rank correlation. AH R scripts can be found in 
the online supporting information. 

Results 

The genetic distance matrix (SNP) and the climate dis- 
tance matrix for the 60 accessions collected from less dis- 
turbed sites are both represented as neighbor-joining trees 
(Fig. 1). The Robinson-Foulds distance metric for the 
two neighbor-joining trees was 110, indicating very low 
concordance between trees. The least possible concor- 
dance is 2n — 6, 114 in this case. The calculated Robin- 
son-Foulds for a set of 100 randomized topologies for 
this data set is 113 with a 95% confidence interval of 
±0.2. Therefore, although the concordance between these 
two trees is very low, there is a small signal of lineage on 
occupied climate space. 

The results of the partial Mantel tests are presented in 
Table 1 for the 60 accessions collected from less disturbed 
sites that in our opinion more likely reflect native habitat. 
The partial Mantel correlations comparing the genetic 
distance matrix to the climate distance matrix and con- 
trolling for geographic distance were positive but low. 
Including the 67 moderately disturbed localities with the 
less disturbed (a total of 127 lineages) did not affect the 
ranked correlations from the partial Mantel test but 
reduced the Pearson correlation by about two thirds 
(from r = 0.23 to r = 0.07). When a partial Mantel test 
was conducted with the 67 lineages from the moderately 
disturbed localities alone, the Pearson correlation between 
genetic distance and climate distance was not significantly 
different from 0 (r = 0.02, P = 0.285). Therefore, we 
decided to base our conclusions only on the 60 accessions 
collected from less disturbed localities. 

As an additional visualization we include a scatter plot 
of pairwise climate distance by pairwise genetic distance for 
the 60 accessions that shows a positive but low correlation, 
with the majority of the points reflecting high genetic dis- 
tance coupled with low climate distance (Fig. 2). We 
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Figure 1. Neighbor-joining trees from 
NEIGHBOR in PHYLIP. (A) Tree reconstructed 
from a genetic distance matrix for Arabidopsis 
thaliana lines from across Europe and Asia. 
Pairwise genetic distance was calculated from 
149 SNPs for 60 A. thaliana lines. (B) Tree 
reconstructed from climate data matrix for the 
habitat of each A. thaliana line. Pairwise 
climate Gower's distance was calculated from 
nine climate factors for 60 collection localities. 
Lines joining the two trees indicate which 
genotype (A) inhabits which climate space (B). 
There was overlap in collection sites for the 60 
A. thaliana lines. 



Table 1. The results of partial Mantel test estimating the correlation 
between genetic distance from 60 Arabidopsis thaliana lines and 
climate distance from their collection localities across Europe and Asia 
and controlling for geographic distance. 





Partial mantel test 


P 


Pearson correlation (r) 


0.2389 


0.001 


Spearman rank correlation (p) 


0.07039 


0.029 


Kendall rank correlation (t) 


0.06944 


0.003 



Pairwise genetic distance was calculated from 149 SNPs. Pairwise 
climate Gower's distance was calculated from nine climate factors for 
each collection locality. There was overlap in some localities. The 
results are presented for Pearson correlation, Spearmen rank correla- 
tion, and Kendall rank correlation. 

acknowledge that pseudoreplication is a concern with this 
presentation and we do not base any of our formal analyses 
on this figure. It is included solely for illustrative purposes. 

Discussion 

We demonstrate positive but low concordance between 
genetic relatedness in A. thaliana populations and the cli- 
mate space that those populations inhabit. Both the 
partial Mantel tests and the Robertson-Foulds index indi- 
cate that genetic relatedness has little explanatory power 
in predicting the climate in which a genotype will be 
found. We interpret our results to mean that similar A. 
thaliana genotypes are able to occupy different climate 
regimes and that different genotypes have access or the 
ability to evolve to similar climate regimes. Therefore, 




0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 

Pairwise Genetic Distance 

Figure 2. Scatter plot of pairwise climate data (Gower's distance) 
versus pairwise genetic data for 60 Arabidopsis thaliana accessions 
collected across its native range. As these data suffer from 
pseudoreplication, we did not include it in our formal analyses and 
only include it for illustrative purposes. The figure is divided into four 
quadrants representing general relationships between climate distance 
and genetic distance. They are (A) high climate distance and low 
genetic distance, (B) high climate distance and high genetic distance, 
(C) low climate distance and low genetic distance, and (D) low 
climate distance and low genetic distance. The majority of the points 
reflect a correlation between low climate distance and high genetic 
distance. 

access to different climate spaces appears to be relatively 
unconstrained by the A. thaliana genotype. This is also 
seen in Figure 2 where a number of genotypes have zero 
genetic distance based on the survey of 149 SNP's and yet 
occupy a wide range of climate regimes. 
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Hoffmann (2005) examined the evolution of climate 
adaptation in the genus Arabidopsis using phylogenetic 
reconstruction with climate space as a character. The 
analyses determined the core climate space (the climate 
space where all studied taxa coexist) and the realized 
climate niche (the intersection of taxa distribution ranges 
and climate data) of the genus. Hoffmann concluded that 
there was a high degree of parallel evolution to climate 
across the genus. Here, we demonstrate this same 
phenomenon within a species. 

The ability of different genotypes to access similar cli- 
mate habitats, and vice versa may help explain how A. 
thaliana has been able to achieve a cosmopolitan distribu- 
tion in such a short period (e.g., across North America in 
approximately the last 150-200 years) (Vander Zwan 
et al. 2000; Jorgensen and Mauricio 2004). Given its 
annual life history it is possible that populations have 
adapted to climate on the order of tens to hundreds of 
generations. Recent range expansion in A. thaliana is cer- 
tainly a result of human interference, but it is believed 
that the dispersal of self-fertilizing seed colonists has been 
the most important force in the history of the species. 
The ability of these annual colonists to rapidly adapt to 
novel climate habitats likely facilitated this process (Samis 
et al. 2012). Genetic adaptation to climate may exhibit 
the pattern we found due to parallel evolution or conver- 
gent evolution. In the former, the same genetic changes 
occur independently. In the latter, different genetic 
changes occur but the end result is the same. Several 
greenhouse and field studies have found a high beneficial 
mutation rate in A. thaliana, with as many as 50% of 
new mutations conferring a fitness advantage (Shaw et al. 
2000; MacKenzie et al. 2005; Rutter et al. 2010, 2012). 
Thus, the independent adaptation to similar climates by 
genotypes that are not directly related may be due to the 
contribution of new beneficial mutations. 

We foresee two potential concerns for our interpreta- 
tion of independent adaptation to climate space. The first 
is the possibility that A. thaliana is phenotypically plastic 
with regard to climate space. Although this likely plays 
some role, we believe that our study also reflects genetic 
adaptation to climate. A reciprocal transplant study 
demonstrated genetic adaptation to climate between two 
European populations (Agren and Schemske 2012). Like- 
wise, a common garden experiment has shown that the 
success of an accession of A. thaliana in a particular habi- 
tat can be accurately predicted by the similarity of that 
habitat to the native habitat of the ecotype (Rutter and 
Fenster 2007), consistent with adaptive differentiation to 
climate. Finally, analysis of the 67 accessions from locali- 
ties we deemed that moderately disturbed showed no 
correlation between climate distance and genetic distance. 
We therefore feel our criteria for selecting localities were 



sufficiently conservative and reflect accessions that are 
likely to be locally adapted to their climate regime. 

The second concern is that our study may not have 
captured variation in the loci that are involved in climate 
adaptation. The 149 SNP markers that we used to con- 
struct the genetic distance matrix were not intended to be 
used to identify loci associated with climate adaptation, 
although it is likely that some were given that linkage dis- 
equilibrium estimates vary from 10- 250 kb (Nordborg 
et al. 2002, 2005; Kim et al. 2007). Rather our SNP -based 
genetic distance tree clearly demonstrates that genetic 
relatedness is not a strong predictor of the climate inhab- 
ited by the genotype. We do know that climate adapta- 
tion in A. thaliana involves the interaction of a large 
number of loci distributed throughout its genome 
(Wilczek et al. 2009). In a recent study using 214,051 
SNPs and 1003 accessions of A. thaliana, 15.7% of the 
genetic variation was found to be associated with climate 
(Lasky et al. 2012). Therefore, one would not expect all 
the same loci to be involved in the evolution to similar 
climates given the large number of loci so far identified 
to be associated with climate adaptation. 

Based on our results, it seems likely that parallel evolu- 
tion is common not just among species of Arabidopsis 
(Hoffmann 2005) but also within A. thaliana. An intrigu- 
ing generality from adaptation genetics studies is that, at 
the sequence level, parallel evolution may be more com- 
mon than once believed (Orr 2005a,b). If this is true, then 
species may not be as genetically constrained with regard 
to potential habitats, or adaptation to changing habitats. 
Our data indicate that A. thaliana is unconstrained with 
regard to climate adaptation within the range of climate in 
which it is found and this may account for its rapid 
cosmopolitan range expansion. Similar results were 
reported by Banta et al. (2012). These authors found that 
later flowering time restricted the niche breadth (measured 
by climate variables) of A. thaliana accessions as compared 
to earlier flowering accessions which were relatively 
unconstrained. Flowering time in their study could be 
controlled by any one of 12 different loci. That is, adapta- 
tion to climate could be affected by at least 12 different 
genetic pathways. 

Our findings have important implications to conserva- 
tion efforts that are responding to anthropogenic change 
(Bradshaw and Holzapfel 2001; Hoffman and Sgro 2011), 
suggesting that it may not be easy to predict which popu- 
lations have the ability to adapt to new climate regimes. 
Using information from ecological and genetic databases 
in conjunction with smaller scale field studies may there- 
fore be a useful way to generate results from a much lar- 
ger number of populations than is feasible using 
experimental methods, and may help shed light on the 
genetics of climate adaptation. 
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Appendix A. 



The 60 Arabidopsis thaliana accessions used in this study. 



Accession 


City 


Country 


Latitude 


Longitude 


Habitat 


CS28055 


Bayreuth 


Germany 


49.941598 


11.571146 


Fallow land 


CS28129 


Calver 


United Kingdom 


53.269942 


-1.642924 


Rocky limestone slope 


CS28133 


Champex 


Switzerland 


46.313743 


6.940206 


Dry loam 
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